Search results

1 – 4 of 4
Article
Publication date: 20 August 2018

Chang-Sup Park

Abstract

Purpose

This paper aims to propose a new keyword search method on graph data to improve the relevance of search results and reduce duplication of content nodes in the answer trees obtained by previous approaches based on distinct root semantics. The previous approaches are restricted to finding answer trees with different root nodes and thus often generate results consisting of answer trees with low relevance to the query or with duplicate content nodes. The proposed method allows limited redundancy in the root nodes of the top-k answer trees to produce more effective query results.

Design/methodology/approach

A measure of redundancy in a set of answer trees with respect to their root nodes is defined, and according to this metric, a set of answer trees with limited root redundancy is proposed as the result of a keyword query on graph data. For efficient query processing, an index on the useful paths in the graph, built from inverted lists and a hash map, is suggested. Then, based on the path index, a top-k query processing algorithm is presented to find the most relevant and diverse answer trees given a maximum amount of root redundancy allowed for a set of answer trees.
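
The selection step can be illustrated with a minimal sketch. It assumes a hypothetical redundancy measure (the fraction of selected trees that reuse an already chosen root) and a simple greedy pass over candidates sorted by relevance. The paper's actual measure, path index and algorithm are not given in the abstract, and the names `AnswerTree`, `root_redundancy` and `top_k_limited_redundancy` are illustrative only.

```python
from collections import Counter
from typing import NamedTuple


class AnswerTree(NamedTuple):
    root: str     # identifier of the tree's root node
    score: float  # relevance to the query (higher is better)


def root_redundancy(trees):
    """Hypothetical measure: fraction of trees that repeat an already-used root."""
    if not trees:
        return 0.0
    counts = Counter(t.root for t in trees)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(trees)


def top_k_limited_redundancy(candidates, k, max_repeats):
    """Greedily pick up to k trees by score, allowing at most max_repeats repeated roots."""
    selected = []
    seen_roots = set()
    repeats = 0
    for tree in sorted(candidates, key=lambda t: t.score, reverse=True):
        if len(selected) == k:
            break
        if tree.root in seen_roots:
            if repeats >= max_repeats:
                continue  # redundancy budget exhausted, skip this tree
            repeats += 1
        seen_roots.add(tree.root)
        selected.append(tree)
    return selected


# Assumed toy data: three trees share root r1.
candidates = [
    AnswerTree("r1", 0.9), AnswerTree("r1", 0.8), AnswerTree("r1", 0.7),
    AnswerTree("r2", 0.5), AnswerTree("r3", 0.4),
]
distinct = top_k_limited_redundancy(candidates, k=3, max_repeats=0)
relaxed = top_k_limited_redundancy(candidates, k=3, max_repeats=1)
```

With `max_repeats=0` the sketch behaves like distinct root semantics. Allowing one repeated root admits the second-best tree rooted at `r1` in place of the lower-scoring tree rooted at `r3`.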

Findings

The results of experiments using real graph datasets show that the proposed approach can produce effective query answers which are more diverse in the content nodes and more relevant to the query than the previous approach based on distinct root semantics.

Originality/value

This paper is the first to take redundancy in the root nodes of answer trees into account in order to improve the relevance and reduce the content-node redundancy of query results compared with the previous distinct root semantics. It can satisfy users’ various information needs on large and complex graph data using a keyword-based query.

Details

International Journal of Web Information Systems, vol. 14 no. 3
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 12 May 2023

Chang-Sup Park

Abstract

Purpose

This paper studies keyword search over graph-structured data, which is used in various fields such as the semantic web, linked open data and social networks. It aims to propose an efficient keyword search algorithm on graph data that finds the top-k answers that are most relevant to the query and have diverse content nodes for the input keywords.

Design/methodology/approach

Based on an aggregate measure of the diversity of an answer set, this study proposes an approach to searching for the top-k diverse answers to a query on graph data, which finds the set of most relevant answer trees whose average dissimilarity is no lower than a given threshold. This study defines a diversity constraint that must be satisfied for a subset of answer trees to be included in the solution. Then, an enumeration algorithm and a heuristic search algorithm are proposed to find an optimal solution efficiently based on the diversity constraint and an A* heuristic. This study also provides strategies for improving the performance of the heuristic search method.
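
The diversity constraint can be sketched as a brute-force enumeration. This is not the paper's enumeration or A* algorithm: it assumes, for illustration only, that dissimilarity is the Jaccard distance between the content-node sets of two answers, and it scores a subset by the sum of its relevance scores. `Answer`, `average_dissimilarity` and `best_diverse_subset` are hypothetical names.

```python
from itertools import combinations
from typing import FrozenSet, NamedTuple


class Answer(NamedTuple):
    name: str
    score: float              # relevance to the query
    content: FrozenSet[str]   # content nodes matching the keywords


def jaccard_distance(a, b):
    """Assumed dissimilarity: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_dissimilarity(answers):
    """Mean pairwise dissimilarity of a set of answers."""
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0  # assumption: fewer than two answers is trivially diverse
    return sum(jaccard_distance(a.content, b.content) for a, b in pairs) / len(pairs)


def best_diverse_subset(answers, k, threshold):
    """Return the k-subset with the highest total score that meets the constraint."""
    best, best_score = [], float("-inf")
    for subset in combinations(answers, k):
        if average_dissimilarity(subset) < threshold:
            continue  # violates the diversity constraint
        total = sum(a.score for a in subset)
        if total > best_score:
            best, best_score = list(subset), total
    return best


# Assumed toy data: a1 and a2 cover the same content nodes.
answers = [
    Answer("a1", 0.9, frozenset({"n1", "n2"})),
    Answer("a2", 0.8, frozenset({"n1", "n2"})),
    Answer("a3", 0.5, frozenset({"n3", "n4"})),
]
diverse = best_diverse_subset(answers, k=2, threshold=0.5)
unconstrained = best_diverse_subset(answers, k=2, threshold=0.0)
```

Exhaustive enumeration is exponential in the number of candidates, which is why a pruned or heuristic search of the kind the abstract describes is needed on large graphs.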

Findings

The results of experiments using a real data set demonstrate that the proposed search algorithm can find top-k diverse and relevant answers to a query on large-scale graph data efficiently and outperforms the previous methods.

Originality/value

This study proposes a new keyword search method for graph data that finds an optimal solution with diverse and relevant answers to the query. It can provide users with query results that satisfy their various information needs on large graph data.

Details

International Journal of Web Information Systems, vol. 19 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 14 April 2014

Chang-Sup Park and Sungchae Lim

Abstract

Purpose

The paper aims to propose an effective method to process keyword-based queries over graph-structured databases, which are widely used in various applications such as XML, the semantic web and social network services. To satisfy users' information needs, it proposes an extended answer structure for keyword queries, inverted list indexes on keywords and nodes, and query processing algorithms that exploit the inverted lists. The study aims to provide more effective and relevant answers to a given query than the previous approaches in an efficient way.

Design/methodology/approach

A new measure of the relevance of nodes to a given keyword query is defined in the paper and, according to this relevance metric, a new answer tree structure is proposed which places no constraint on the number of keyword nodes chosen for each query keyword. For efficient query processing, an inverted list-style index is suggested which pre-computes connectivity and relevance information on the nodes in the graph. Then, a query processing algorithm based on the pre-constructed inverted lists is designed, which aggregates list entries for each graph node relevant to the given keywords and identifies the top-k root nodes of the answer trees most relevant to the given query. The basic search method is also enhanced by using extended inverted lists, which store additional relevance information on the related entries in the lists, in order to estimate the relevance score of a node more closely and to find top-k answers more efficiently.
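
The aggregation step can be sketched as follows. The index layout (a mapping from each keyword to a list of `(node, relevance)` entries) and the use of a plain sum as the aggregate are assumptions made for illustration; the paper's actual relevance measure and list structure are not specified in the abstract. A node missing from some keyword's list still receives a score, which loosely mirrors OR semantics.

```python
import heapq
from collections import defaultdict


def top_k_roots(index, keywords, k):
    """Sum per-keyword relevance for each node and return the k best (node, score) pairs."""
    totals = defaultdict(float)
    for keyword in keywords:
        # A keyword absent from the index simply contributes nothing.
        for node, relevance in index.get(keyword, []):
            totals[node] += relevance
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


# Assumed toy index: keyword -> [(candidate root node, pre-computed relevance)].
index = {
    "graph": [("n1", 0.5), ("n2", 0.25)],
    "search": [("n1", 0.25), ("n3", 0.5)],
}
roots = top_k_roots(index, ["graph", "search"], k=2)
```

This sketch scans every list in full. An index that also stores bounds on the remaining relevance, as the extended inverted lists appear to do, would allow the scan to stop early once the top-k roots can no longer change.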

Findings

Experiments with real datasets and various test queries were conducted to evaluate the effectiveness and performance of the proposed methods in comparison with one of the previous approaches. The experimental results show that the proposed methods with an extended answer structure produce more effective top-k results than the compared previous method for most of the queries, especially for those with OR semantics. The extended inverted list and enhanced search algorithm are shown to substantially improve execution performance compared with the basic search method.

Originality/value

This paper proposes a new extended answer structure and query processing scheme for keyword queries on graph databases which can satisfy users' information needs represented by a keyword set having various semantics.

Details

International Journal of Web Information Systems, vol. 10 no. 1
Type: Research Article
ISSN: 1744-0084

Article
Publication date: 6 September 2016

Collins Udanor, Stephen Aneke and Blessing Ogechi Ogbuokiri

Abstract

Purpose

The purpose of this paper is to use the Twitter Search Network of the Apache NodeXL data discovery tool to extract over 5,000 data items from Twitter accounts that tweeted, retweeted or commented on the hashtag #NigeriaDecides, to gain insight into the impact of social media on the politics and administration of developing countries.

Design/methodology/approach

Several algorithms, including the Fruchterman-Reingold algorithm, the Harel-Koren Fast Multiscale algorithm and the Clauset-Newman-Moore algorithm, are used to analyse social media metrics such as betweenness and closeness centrality and to visualize the sociograms.
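
The two centrality metrics named above can be computed on a small undirected, unweighted graph as in the following sketch. It is a plain standard-library illustration (breadth-first search for closeness, Brandes' algorithm for betweenness), not the NodeXL implementation, and the three-account graph is invented toy data.

```python
from collections import deque


def closeness(graph):
    """Closeness of each node: (reachable nodes - 1) / sum of shortest-path lengths."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Unnormalized betweenness for an undirected graph (Brandes' algorithm)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        order = []                           # nodes in order of discovery
        preds = {v: [] for v in graph}       # predecessors on shortest paths
        sigma = dict.fromkeys(graph, 0)      # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):            # accumulate dependencies back to source
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair is counted from both endpoints in an undirected graph.
    return {v: s / 2 for v, s in score.items()}


# Assumed toy network: account "b" links accounts "a" and "c".
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
close = closeness(graph)
between = betweenness(graph)
```

In this toy network the bridging account `b` has the highest value on both metrics, which is the kind of signal used to identify influential accounts in a sociogram.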

Findings

Results from a typical application of this tool, to the Nigerian general election of 2015, show social media as the major influencer and demonstrate the contribution of social media data analytics to predicting trends that may influence developing economies.

Practical implications

With this type of work, stakeholders can make informed decisions based on predictions that can yield a high degree of accuracy, as in this case. It is also important to stress that this work can be reproduced for any other part of the world, as it is not limited to developing countries or to Nigeria in particular, nor is it limited to the field of politics.

Social implications

During the 2015 general election, citizens increasingly took over the blogosphere by writing, commenting and reporting on issues ranging from politics, society, human rights and disasters to contestants, attacks and other community-related matters. One such instance is the #NigeriaDecides network on Twitter. The effect of this activity showed in the opinion polls organized by the various interest groups and media houses, which were all in favour of GMB.

Originality/value

The authors' case study of Nigeria’s 2015 general election further strengthens the view that developing countries have joined the social media race. The major contribution of this work is that policy makers, politicians, business managers and others can use the methods shown here to harness and gain insights from Big Data such as social media data.

Details

Program, vol. 50 no. 4
Type: Research Article
ISSN: 0033-0337
